TEO-based speaker stress assessment using hybrid classification and tracking schemes

Authors

  • John H. L. Hansen
  • Evan Ruzanski
  • Hynek Boril
  • James Meyerhoff
Abstract

Speaker variability is known to have an adverse impact on speech systems that process linguistic content, such as speech and language recognition. However, changes in speech production due to stress and emotion have a similarly detrimental effect on speaker recognition, because they introduce a mismatch with speaker models that are typically trained on modal speech. The focus of this study is the analysis of stress-induced variations in speech and the design of an automatic stress level assessment scheme that could be used to direct stress-dependent acoustic models or normalization strategies. Current stress detection methods typically make a binary decision on whether or not the speaker is under stress. In reality, the amount of stress in individuals varies and can change gradually. Using speech and biometric data collected in a real-world, variable-stress-level law enforcement training scenario, this study considers two methods for stress level assessment. The first approach uses a nearest neighbor clustering scheme at the vowel token and sentence levels to classify speech data into three levels of stress. The second approach employs Euclidean distance metrics within the multi-dimensional feature space to provide real-time stress level tracking capability. Evaluations on audio data confirmed by biometric readings show both methods to be effective in assessing stress level within a speaker (average accuracy of 55.6% in a 3-way classification task). In addition, the impact of high-level stress on in-set speaker recognition is evaluated and shown to reduce accuracy from 91.7% (low/mid stress) to 21.4% (high-level stress).

J.H.L. Hansen · E. Ruzanski · H. Bořil · J. Meyerhoff, Center for Robust Speech Systems (CRSS), University of Texas at Dallas, 800 West Campbell Rd, EC33, Richardson, TX 75080-3021, USA
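The two assessment ideas named in the abstract, assigning speech to one of three stress levels by nearest neighbor and tracking stress through Euclidean distance in feature space, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the two-dimensional feature vectors, the centroid values, the function names `classify_stress` and `track_stress`, and the simplification of nearest neighbor clustering to a nearest-centroid rule. The abstract does not give the features or the procedure used in the study.

```python
import math

# Hypothetical 2-D feature centroids for three stress levels.
# Values are illustrative only and are not taken from the paper.
CENTROIDS = {
    "low": (0.0, 0.0),
    "mid": (1.0, 1.0),
    "high": (2.0, 2.0),
}


def classify_stress(features, centroids=CENTROIDS):
    """Return the label of the centroid nearest to `features` (Euclidean)."""
    return min(centroids, key=lambda label: math.dist(features, centroids[label]))


def track_stress(frames, reference=CENTROIDS["low"]):
    """Return the Euclidean distance of each frame from a low-stress reference."""
    return [math.dist(frame, reference) for frame in frames]


# Three made-up feature vectors, one per analysis frame.
frames = [(0.1, 0.0), (0.9, 1.2), (2.1, 1.8)]
labels = [classify_stress(f) for f in frames]
distances = track_stress(frames)

for frame, label, dist in zip(frames, labels, distances):
    print(frame, label, round(dist, 3))
```

The classifier yields a discrete three-way decision, while the distance sequence gives a continuous value per frame that can be followed over time, which is the distinction the abstract draws between classification and tracking.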


Similar resources

Methods for stress classification: nonlinear TEO and linear speech based features

Speech production variations due to perceptually induced stress contribute significantly to reduced speech processing performance. One approach that can improve the robustness of speech processing (e.g., recognition) algorithms against stress is to formulate an objective classification of speaker stress based upon the acoustic speech signal. In this paper, an overview of recent methods for stres...


Classification of speech under stress based on features derived from the nonlinear Teager energy operator

Studies have shown that distortion introduced by stress or emotion can severely reduce speech recognition accuracy. Techniques for detecting or assessing the presence of stress could help neutralize stressed speech and improve robustness of speech recognition systems. Although some acoustic variables derived from linear speech production theory have been investigated as indicators of stress, t...


Recognition of stress in speech using wavelet analysis and Teager energy operator

The automatic recognition and classification of speech under stress has applications in behavioural and mental health sciences, human to machine communication and robotics. The majority of recent studies are based on a linear model of the speech signal. In this study, the nonlinear Teager Energy Operator (TEO) analysis was used to derive the classification features. Moreover, the TEO analysis w...
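For orientation, the discrete-time Teager Energy Operator mentioned in these abstracts is commonly written as ψ[x(n)] = x(n)² − x(n−1)·x(n+1). The sketch below applies that definition to a synthetic cosine, for which the operator output is the constant A²·sin²(Ω). The function name `teager_energy` and the test signal are assumptions for illustration; how the listed papers turn TEO output into classification features is not shown in the truncated abstracts and is not reproduced here.

```python
import math


def teager_energy(x):
    """Discrete TEO: psi[n] = x[n]^2 - x[n-1] * x[n+1] for interior samples."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Synthetic test signal (illustrative): a cosine with amplitude A and
# digital frequency omega in radians per sample.
amplitude, omega = 2.0, 0.3
signal = [amplitude * math.cos(omega * n) for n in range(50)]

psi = teager_energy(signal)

# For a pure cosine the TEO output equals A^2 * sin^2(omega) at every sample.
expected = (amplitude * math.sin(omega)) ** 2
print(round(psi[0], 6), round(expected, 6))
```

Because the output depends on both amplitude and frequency, the operator responds to changes in either one, which is one reason it is described as a nonlinear energy measure.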


Frequency band analysis for stress detection using a teager energy operator based feature

Studies have shown that the performance of speech recognition algorithms degrades severely in the presence of task- and emotion-induced stress in adverse conditions. This paper addresses the problem of detecting the presence of stress in speech by analyzing nonlinear feature characteristics in specific frequency bands. The framework of the previously derived Teager Energy Operator (TEO) base...


Model Selection Based on Tracking Interval Under Unified Hybrid Censored Samples

The aim of statistical modeling is to identify the model that most closely approximates the underlying process. The Akaike information criterion (AIC) is commonly used for model selection, but the precise value of AIC has no direct interpretation. In this paper we use a normalization of a difference of Akaike criteria to compare two rival models under unified hybrid cens...



Journal:
  • I. J. Speech Technology

Volume 15


Publication date: 2012